Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix early-data test #132

Merged
merged 5 commits into from
Feb 19, 2023
Merged

Fix early-data test #132

merged 5 commits into from
Feb 19, 2023

Conversation

divergentdave
Copy link
Contributor

The domain name this test uses was not updated when the test certificate was replaced previously. FWIW, this test isn't covered by CI, since it's behind an off-by-default feature flag.

@quininer
Copy link
Member

Would you like to open it on the ci by the way?

@divergentdave
Copy link
Contributor Author

Sure, I added another command to run that test.

@quininer
Copy link
Member

The test seems to fail on windows, maybe there is a problem with openssl on windows ci machine. we can disable it on windows ci.

run: cargo test --all
run: |
cargo test --all
cargo test -p tokio-rustls --features early-data
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can add --test test_0rtt so that it doesn't need to repeat the test

@divergentdave
Copy link
Contributor Author

I ran a modified version of the test on both Linux and Windows, and it seems that this might be due to cross-platform differences in OpenSSL, but I can't rule out an issue in rustls yet. In both cases, I do see "hello" sent in application data. On Linux, the server sends two session tickets (nonce zero before the client's days and nonce one after the client's data), but I don't see any session tickets on Windows. In both cases, I see a client-to-server close notify alert, and I see a FIN from the client. However, I do not see a FIN from the server on Windows, only Linux. Even more suspicious, OpenSSL's key log is missing CLIENT_TRAFFIC_SECRET_0, compared to the key log file from rustls. The connection works enough to get data through in both directions, but there seems to be something wrong with higher-level connection state machines.

With a base set of changes, I see that openssl s_client stdout does not get as far on Windows as on Linux. I see only two lines printed, "Using default temp DH parameters", and "ACCEPT". Interestingly, if I additionally write a line to openssl s_server's stdin, I see that come out rustls's read end of the connection, and then I see a third line from openssl s_server's stdout, "helloNo early data received". (which is expected) The test still deadlocks waiting for the server to close its end of the TLS connection.

This suggests one side or the other may be forgetting to read from a pipe or socket when it should, or forgetting to wake a future appropriately. (I added a missing wake to a Pending control flow in Read1, and that was not sufficient, for what it's worth)

@divergentdave
Copy link
Contributor Author

I have some changes in a branch at https://github.com/divergentdave/tokio-tls/tree/early-data-debugging, and I think the results strongly suggest an I/O handling bug in s_server on Windows. The most recent change I applied was writing to the s_server process's stdin repeatedly, with some delay. These writes (which ultimately cause the server to send data to the client) do get the connection unstuck multiple times, and the test eventually passes with them. OpenSSL's debug output, plus my modified test's output, shows the following.

$ cargo test -p tokio-rustls --features early-data --test early-data -- --nocapture
...
OpenSSL s_server stdout: "write to 0x22949068920 [0x229490c8500] (2176 bytes => 2176 (0x880))"
OpenSSL s_server stdout: "0000 - 16 03 03 00 7a 02 00 00-76 03 03 6c c3 98 ab f2   ....z...v..l...."
OpenSSL s_server stdout: "0010 - bd 30 bc 17 ec 9e 9a 3e-58 71 47 5a a6 6a 5b a0   .0.....>XqGZ.j[."
...
OpenSSL s_server stdout: "05a0 - b8 a8 31 b6 8e 24 9f ca-36 b4 9b 63 6a a3 22 8e   ..1..$..6..cj.\"."
OpenSSL s_server stdout: "05b0 - 90 e3 19 32 1f 0a c3 d2-ae a3 6f ff 3c ad 72 6b   ...2......o.<.rk"
Writing to OpenSSL s_server stdin
OpenSSL s_server stdout: "05c0 - 2a 38 23 02 b4 5f 28 36-91 63 20 14 be a6 f4 65   *8#.._(6.c ....e"
OpenSSL s_server stdout: "05d0 - 5f 77 10 4c f3 1e fb df-2b c3 ae a2 a9 5c 27 ab   _w.L....+....\\'."
...
OpenSSL s_server stdout: "0860 - f6 4c e2 9d 8d 52 37 17-58 21 6f 97 a2 5a 19 ee   .L...R7.X!o..Z.."
OpenSSL s_server stdout: "0870 - 15 b4 8c b8 33 43 23 cc-4c ed 1a aa 06 a5 f6 a1   ....3C#.L......."
OpenSSL s_server stdout: "No early data received"
...

Here, we see that OpenSSL intended to write 0x880 bytes, but only wrote about half of it at first. Immediately after the test wrote a newline to OpenSSL's stdin, it finished sending out this write of handshake data. Looking at the packet capture, the first packet sent after the first delay, corresponding to the first network event after the stdin write, is a new session ticket message sent from the server to the client. Thus, I think we can rule out misbehavior on the part of tokio-rustls (for example, failing to drain a TCP connection buffer) because the connection gets unstuck without any interaction with the Rust side (ignoring the stdin write itself). On the other hand, running openssl s_server -early_data -tls1_3 -cert ./tests/end.cert -key ./tests/end.rsa -port 12354 and openssl s_client -connect 127.0.0.1:12354 -sess_out session.pem together doesn't run into the same issue, so there must be some way Tokio or rustls affects this, even if it's just preconditions for triggering a bug.

@divergentdave divergentdave changed the title Fix domain name in early-data test Fix early-data test Feb 18, 2023
This is necessary to work around an issue that only appears on Windows.
@quininer
Copy link
Member

Thank you for your meticulous research!

@divergentdave divergentdave deleted the patch-1 branch March 19, 2023 19:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants